Papers

Gemma: Open Models Based on Gemini Research and Technology

Google / Google DeepMind

Published on: 2024-04-16 1 author
ChipNeMo: Domain-Adapted LLMs for Chip Design

NVIDIA

Published on: 2024-04-04 1 author
mPLUG-Owl: Modularization Empowers Large Language Models with Multimodality

Alibaba

Published on: 2024-03-29 1 author
Word Importance Explains How Prompts Affect Language Model Outputs

DataRobot

Published on: 2024-03-05 1 author
SPUQ: Perturbation-Based Uncertainty Quantification for Large Language Models

Intuit

Published on: 2024-03-04 1 author
GPT-4 Technical Report

OpenAI

Published on: 2024-03-04 1 author
The Claude 3 Model Family: Opus, Sonnet, and Haiku

Anthropic

Published on: 2024-03-04 1 author
Snap Video: Scaled Spatiotemporal Transformers for Text-to-Video Synthesis

Snap

Published on: 2024-02-22 1 author
SAC3: Reliable Hallucination Detection in Black-Box Language Models via Semantic-aware Cross-check Consistency

Intuit / Vanderbilt University

Published on: 2024-02-18 1 author
DINOv2: Learning Robust Visual Features without Supervision

Meta Platforms / Meta AI Research

Published on: 2024-02-02 1 author
Sleeper Agents: Training Deceptive LLMs that Persist Through Safety Training

Anthropic

Published on: 2024-01-17 1 author
Surgical-DINO: Adapter Learning of Foundation Models for Depth Estimation in Endoscopic Surgery

Intuit / Vanderbilt University

Published on: 2024-01-12 1 author
Mixtral of Experts

Mistral AI

Published on: 2024-01-08 16 authors
Autonomous Procedural Operations (ProcOps)

Honeywell International

Published on: 2024-01-04 1 author
Speech Translation with Large Language Models: An Industrial Practice

ByteDance

Published on: 2023-12-21 1 author
VideoPoet: A Large Language Model for Zero-Shot Video Generation

Google

Published on: 2023-12-21 8 authors
Weak-to-Strong Generalization: Eliciting Strong Capabilities With Weak Supervision

OpenAI

Published on: 2023-12-14 1 author
Knowledge Diffusion for Distillation

Snap / The University of Sydney

Published on: 2023-12-04 1 author
Interactive Multi-fidelity Learning for Cost-effective Adaptation of Language Model with Sparse Human Supervision

Intuit / Vanderbilt University

Published on: 2023-10-31 1 author
Diffusion Models Without Attention

Apple / Cornell University

Published on: 2023-10-30 1 author
Towards Making the Most of ChatGPT for Machine Translation

Microsoft / Beihang University

Published on: 2023-10-20 1 author
VOYAGER: An Open-Ended Embodied Agent with Large Language Models

NVIDIA / Caltech

Published on: 2023-10-19
Eureka: Human-Level Reward Design via Coding Large Language Models

NVIDIA

Published on: 2023-10-19
BitNet: Scaling 1-bit Transformers for Large Language Models

Microsoft / Microsoft Research

Published on: 2023-10-17 1 author
SnapFusion: Text-to-Image Diffusion Model on Mobile Devices within Two Seconds

Snap / Northeastern University, China

Published on: 2023-10-16 1 author
CodeChain: Towards Modular Code Generation Through Chain of Self-revisions with Representative Sub-modules

Salesforce

Published on: 2023-10-13 1 author
Qwen-VL: A Versatile Vision-Language Model for Understanding, Localization, Text Reading, and Beyond

Alibaba

Published on: 2023-10-13 1 author
Solving Olympiad Geometry without Human Demonstrations (AlphaGeometry)

Google

Published on: 2023-10-13 1 author
Deep Industrial Image Anomaly Detection: A Survey

Honeywell International / University of Surrey

Published on: 2023-10-12 1 author
Mistral 7B

Mistral AI

Published on: 2023-10-10 1 author
mPLUG-Owl2: Revolutionizing Multi-modal Large Language Model with Modality Collaboration

Alibaba

Published on: 2023-10-09 1 author
SteerLM: Attribute Conditioned SFT as an (User-Steerable) Alternative to RLHF

NVIDIA

Published on: 2023-10-09 1 author
LRM: Large Reconstruction Model for Single Image to 3D

Adobe / Australian National University

Published on: 2023-10-08 1 author
AutoGen: Enabling Next-Gen LLM Applications via Multi-Agent Conversation

Microsoft / Microsoft Research

Published on: 2023-10-03 1 author
Qwen Technical Report

Alibaba

Published on: 2023-09-28 1 author
Qwen Technical Report

Alibaba

Published on: 2023-09-28 4 authors
Connecting Speech Encoder and Large Language Model for ASR

ByteDance / Tsinghua University

Published on: 2023-09-26 1 author
SWE-Agent: Agent-Computer Interfaces Enable Automated Software Engineering

Princeton University

Published on: 2023-09-14 4 authors
PaLM 2 Technical Report

Google

Published on: 2023-09-13 1 author
Phi-2: A Small Language Model with Reasoning Capability

Microsoft

Published on: 2023-09-12 5 authors
Efficient Memory Management for Large Language Model Serving with PagedAttention

Stanford University, University of California, Berkeley

Published on: 2023-09-11 9 authors
Textbooks Are All You Need

Microsoft / Microsoft Research

Published on: 2023-09-02 1 author
Graph of Thoughts: Solving Elaborate Problems with Large Language Models

Eidgenössische Technische Hochschule Zürich

Published on: 2023-08-18 6 authors
3D Gaussian Splatting for Real-Time Radiance Field Rendering

INRIA, Université Côte d’Azur

Published on: 2023-08-08 Venue: SIGGRAPH 2023 4 authors
MetaGPT: Meta Programming for A Multi-Agent Collaborative Framework

DeepWisdom / Tsinghua University

Published on: 2023-08-01 4 authors
RT-2: Vision-Language-Action Models Transfer Web Knowledge to Robotic Control

Google / Google DeepMind

Published on: 2023-07-28 1 author
Differentially Private Heavy Hitter Detection using Federated Analytics

Apple

Published on: 2023-07-21 1 author
Llama 2: Open Foundation and Fine-Tuned Chat Models

Meta Platforms

Published on: 2023-07-19 1 author
BLIP-2: Bootstrapping Language-Image Pre-training with Frozen Image Encoders and Large Language Models

Salesforce

Published on: 2023-06-15 1 author
Neuralangelo: High-Fidelity Neural Surface Reconstruction

NVIDIA / Johns Hopkins University

Published on: 2023-06-05 1 author
